Dataset statistics
| Number of variables | 11 |
|---|---|
| Number of observations | 775 |
| Missing cells | 1742 |
| Missing cells (%) | 20.4% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 66.7 KiB |
| Average record size in memory | 88.2 B |
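The missing-cell percentage follows from the other figures in this table: missing cells divided by observations times variables. A minimal Python sketch of that arithmetic, using the per-variable missing counts that the report lists in its "Missing" warnings (the names `n_rows`, `n_cols`, and `missing_by_column` are illustrative, not part of the report):

```python
# Figures copied from the report; variable names are illustrative only.
n_rows, n_cols = 775, 11

# Per-variable missing counts, as listed in the report's "Missing" warnings.
missing_by_column = {
    "uspop_growth": 60,
    "med_hIncome": 355,
    "homePrice_index": 373,
    "newHouse_starts": 36,
    "ppi_resConstruct": 365,
    "resConstruct_spending": 553,
}

# Total missing cells and their share of all cells (rows x columns).
missing_cells = sum(missing_by_column.values())
missing_pct = 100 * missing_cells / (n_rows * n_cols)

print(missing_cells)          # 1742
print(f"{missing_pct:.1f}%")  # 20.4%
```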
Variable types
| NUM | 10 |
|---|---|
| CAT | 1 |
Reproduction
| Analysis started | 2020-11-13 02:01:18.124939 |
|---|---|
| Analysis finished | 2020-11-13 02:01:46.573587 |
| Duration | 28.45 seconds |
| Version | pandas-profiling v2.8.0 |
| Command line | pandas_profiling --config_file config.yaml [YOUR_FILE.csv] |
| Download configuration | config.yaml |
Warnings
| Warning | Type |
|---|---|
| homePrice_index is highly correlated with cpi_rent and 1 other field | High correlation |
| cpi_rent is highly correlated with homePrice_index and 1 other field | High correlation |
| ppi_resConstruct is highly correlated with cpi_rent and 1 other field | High correlation |
| uspop_growth has 60 (7.7%) missing values | Missing |
| med_hIncome has 355 (45.8%) missing values | Missing |
| homePrice_index has 373 (48.1%) missing values | Missing |
| newHouse_starts has 36 (4.6%) missing values | Missing |
| ppi_resConstruct has 365 (47.1%) missing values | Missing |
| resConstruct_spending has 553 (71.4%) missing values | Missing |
| DATE has unique values | Unique |
DATE
Categorical
| Distinct count | 775 |
|---|---|
| Unique (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 6.1 KiB |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 1998-06-01 | 1 | 0.1% | |
| 2017-06-01 | 1 | 0.1% | |
| 2017-03-01 | 1 | 0.1% | |
| 2015-02-01 | 1 | 0.1% | |
| 1993-03-01 | 1 | 0.1% | |
| 1964-09-01 | 1 | 0.1% | |
| 1986-01-01 | 1 | 0.1% | |
| 1984-11-01 | 1 | 0.1% | |
| 1987-10-01 | 1 | 0.1% | |
| 2007-03-01 | 1 | 0.1% | |
| Other values (765) | 765 | 98.7% |
Length
| Max length | 10 |
|---|---|
| Median length | 10 |
| Mean length | 10 |
| Min length | 10 |
uspop_growth
Real number (ℝ≥0)
| Distinct count | 59 |
|---|---|
| Unique (%) | 8.3% |
| Missing | 60 |
| Missing (%) | 7.7% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.0067087432455932 |
| Minimum | 0.473953539373292 |
| Maximum | 1.6577300373895298 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 6.1 KiB |
Quantile statistics
| Minimum | 0.4739535394 |
|---|---|
| 5-th percentile | 0.6310078932 |
| Q1 | 0.893829201 |
| median | 0.9642539171 |
| Q3 | 1.16341162 |
| 95-th percentile | 1.404081667 |
| Maximum | 1.657730037 |
| Range | 1.183776498 |
| Interquartile range (IQR) | 0.2695824189 |
Descriptive statistics
| Standard deviation | 0.2364318167 |
|---|---|
| Coefficient of variation (CV) | 0.2348562266 |
| Kurtosis | 0.3601088 |
| Mean | 1.006708743 |
| Median Absolute Deviation (MAD) | 0.1364078754 |
| Skewness | 0.2397468682 |
| Sum | 719.7967514 |
| Variance | 0.05590000395 |
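Several of the figures in the quantile and descriptive statistics tables above are derived from the others, so they can be cross-checked. The sketch below is a hedged illustration using the values reported for this variable; the variable names are illustrative, and the non-missing count of 715 assumes 775 observations minus the 60 missing values the report gives for uspop_growth.

```python
# Values copied from the quantile and descriptive statistics tables.
minimum, maximum = 0.4739535394, 1.657730037
q1, q3 = 0.893829201, 1.16341162
mean, std = 1.006708743, 0.2364318167
n_present = 775 - 60  # observations minus missing values (assumed from the report)

value_range = maximum - minimum  # reported Range: 1.183776498
iqr = q3 - q1                    # reported IQR: 0.2695824189
cv = std / mean                  # reported CV: 0.2348562266
variance = std ** 2              # reported Variance: 0.05590000395
total = mean * n_present         # reported Sum: 719.7967514

for name, value in [("range", value_range), ("IQR", iqr), ("CV", cv),
                    ("variance", variance), ("sum", total)]:
    print(f"{name}: {value:.10g}")
```

Small differences in the last digits are expected because the inputs are themselves rounded.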
Histogram with fixed size bins (bins=10)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 0.4739535394 | 19 | 2.5% | |
| 0.9217131672 | 12 | 1.5% | |
| 0.9241641571 | 12 | 1.5% | |
| 0.9897413822 | 12 | 1.5% | |
| 0.8658173363 | 12 | 1.5% | |
| 0.9254839689 | 12 | 1.5% | |
| 1.389046055 | 12 | 1.5% | |
| 0.9595899228 | 12 | 1.5% | |
| 1.439164762 | 12 | 1.5% | |
| 1.250171646 | 12 | 1.5% | |
| Other values (49) | 588 | 75.9% | |
| (Missing) | 60 | 7.7% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0.4739535394 | 19 | 2.5% | |
| 0.5223373579 | 12 | 1.5% | |
| 0.6310078932 | 12 | 1.5% | |
| 0.6867731556 | 12 | 1.5% | |
| 0.7166694134 | 12 | 1.5% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 1.657730037 | 12 | 1.5% | |
| 1.537997358 | 12 | 1.5% | |
| 1.439164762 | 12 | 1.5% | |
| 1.389046055 | 12 | 1.5% | |
| 1.386885692 | 12 | 1.5% |
med_hIncome
Real number (ℝ≥0)
| Distinct count | 35 |
|---|---|
| Unique (%) | 8.3% |
| Missing | 355 |
| Missing (%) | 45.8% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 57691.8 |
| Minimum | 51742.0 |
| Maximum | 63179.0 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 6.1 KiB |
Quantile statistics
| Minimum | 51742 |
|---|---|
| 5-th percentile | 52709 |
| Q1 | 55716 |
| median | 57856 |
| Q3 | 60038 |
| 95-th percentile | 62626 |
| Maximum | 63179 |
| Range | 11437 |
| Interquartile range (IQR) | 4322 |
Descriptive statistics
| Standard deviation | 2906.43916 |
|---|---|
| Coefficient of variation (CV) | 0.0503787221 |
| Kurtosis | -0.8768863598 |
| Mean | 57691.8 |
| Median Absolute Deviation (MAD) | 2140 |
| Skewness | -0.04949544534 |
| Sum | 24230556 |
| Variance | 8447388.59 |
Histogram with fixed size bins (bins=10)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 55931 | 12 | 1.5% | |
| 51742 | 12 | 1.5% | |
| 55260 | 12 | 1.5% | |
| 54233 | 12 | 1.5% | |
| 59286 | 12 | 1.5% | |
| 61399 | 12 | 1.5% | |
| 61526 | 12 | 1.5% | |
| 61779 | 12 | 1.5% | |
| 60178 | 12 | 1.5% | |
| 54608 | 12 | 1.5% | |
| Other values (25) | 300 | 38.7% | |
| (Missing) | 355 | 45.8% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 51742 | 12 | 1.5% | |
| 52709 | 12 | 1.5% | |
| 53610 | 12 | 1.5% | |
| 53897 | 12 | 1.5% | |
| 54233 | 12 | 1.5% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 63179 | 12 | 1.5% | |
| 62626 | 12 | 1.5% | |
| 61779 | 12 | 1.5% | |
| 61526 | 12 | 1.5% | |
| 61399 | 12 | 1.5% |
rentl_vacnyRate
Real number (ℝ≥0)
| Distinct count | 58 |
|---|---|
| Unique (%) | 7.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 7.33741935483871 |
| Minimum | 5.0 |
| Maximum | 11.1 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 6.1 KiB |
Quantile statistics
| Minimum | 5 |
|---|---|
| 5-th percentile | 5.17 |
| Q1 | 6 |
| median | 7.4 |
| Q3 | 8.2 |
| 95-th percentile | 10.1 |
| Maximum | 11.1 |
| Range | 6.1 |
| Interquartile range (IQR) | 2.2 |
Descriptive statistics
| Standard deviation | 1.505817675 |
|---|---|
| Coefficient of variation (CV) | 0.205224426 |
| Kurtosis | -0.6948940694 |
| Mean | 7.337419355 |
| Median Absolute Deviation (MAD) | 1.1 |
| Skewness | 0.3129023474 |
| Sum | 5686.5 |
| Variance | 2.267486872 |
Histogram with fixed size bins (bins=10)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 7.7 | 30 | 3.9% | |
| 7 | 30 | 3.9% | |
| 5.3 | 30 | 3.9% | |
| 8.2 | 27 | 3.5% | |
| 7.4 | 27 | 3.5% | |
| 7.5 | 27 | 3.5% | |
| 7.3 | 27 | 3.5% | |
| 5.7 | 25 | 3.2% | |
| 8 | 24 | 3.1% | |
| 5.5 | 24 | 3.1% | |
| Other values (48) | 504 | 65.0% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 5 | 21 | 2.7% | |
| 5.1 | 18 | 2.3% | |
| 5.2 | 9 | 1.2% | |
| 5.3 | 30 | 3.9% | |
| 5.4 | 18 | 2.3% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 11.1 | 3 | 0.4% | |
| 10.7 | 3 | 0.4% | |
| 10.6 | 9 | 1.2% | |
| 10.4 | 3 | 0.4% | |
| 10.3 | 3 | 0.4% |
unemplt_rate
Real number (ℝ≥0)
| Distinct count | 74 |
|---|---|
| Unique (%) | 9.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 5.941161290322581 |
| Minimum | 3.4 |
| Maximum | 14.7 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 6.1 KiB |
Quantile statistics
| Minimum | 3.4 |
|---|---|
| 5-th percentile | 3.7 |
| Q1 | 4.8 |
| median | 5.6 |
| Q3 | 7 |
| 95-th percentile | 9.33 |
| Maximum | 14.7 |
| Range | 11.3 |
| Interquartile range (IQR) | 2.2 |
Descriptive statistics
| Standard deviation | 1.664338279 |
|---|---|
| Coefficient of variation (CV) | 0.2801368617 |
| Kurtosis | 1.203190535 |
| Mean | 5.94116129 |
| Median Absolute Deviation (MAD) | 1.1 |
| Skewness | 0.9372684244 |
| Sum | 4604.4 |
| Variance | 2.770021905 |
Histogram with fixed size bins (bins=10)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 5.4 | 32 | 4.1% | |
| 5.7 | 31 | 4.0% | |
| 5.6 | 31 | 4.0% | |
| 5.9 | 26 | 3.4% | |
| 5.5 | 24 | 3.1% | |
| 5 | 23 | 3.0% | |
| 3.8 | 23 | 3.0% | |
| 5.2 | 22 | 2.8% | |
| 6 | 20 | 2.6% | |
| 4.9 | 20 | 2.6% | |
| Other values (64) | 523 | 67.5% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 3.4 | 9 | 1.2% | |
| 3.5 | 12 | 1.5% | |
| 3.6 | 5 | 0.6% | |
| 3.7 | 14 | 1.8% | |
| 3.8 | 23 | 3.0% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 14.7 | 1 | 0.1% | |
| 13.3 | 1 | 0.1% | |
| 11.1 | 1 | 0.1% | |
| 10.8 | 2 | 0.3% | |
| 10.4 | 3 | 0.4% |
int_rate
Real number (ℝ≥0)
| Distinct count | 139 |
|---|---|
| Unique (%) | 17.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 4.579109677419355 |
| Minimum | 0.25 |
| Maximum | 14.0 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 6.1 KiB |
Quantile statistics
| Minimum | 0.25 |
|---|---|
| 5-th percentile | 0.75 |
| Q1 | 2.75 |
| median | 4.5 |
| Q3 | 6 |
| 95-th percentile | 9.683 |
| Maximum | 14 |
| Range | 13.75 |
| Interquartile range (IQR) | 3.25 |
Descriptive statistics
| Standard deviation | 2.825039179 |
|---|---|
| Coefficient of variation (CV) | 0.6169407108 |
| Kurtosis | 0.9312057581 |
| Mean | 4.579109677 |
| Median Absolute Deviation (MAD) | 1.5 |
| Skewness | 0.8509274036 |
| Sum | 3548.81 |
| Variance | 7.980846364 |
Histogram with fixed size bins (bins=10)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 3 | 77 | 9.9% | |
| 0.75 | 71 | 9.2% | |
| 6 | 47 | 6.1% | |
| 5 | 42 | 5.4% | |
| 4.5 | 39 | 5.0% | |
| 5.5 | 36 | 4.6% | |
| 3.5 | 32 | 4.1% | |
| 4 | 29 | 3.7% | |
| 5.25 | 24 | 3.1% | |
| 7 | 22 | 2.8% | |
| Other values (129) | 356 | 45.9% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0.25 | 5 | 0.6% | |
| 0.5 | 14 | 1.8% | |
| 0.75 | 71 | 9.2% | |
| 0.83 | 1 | 0.1% | |
| 1 | 12 | 1.5% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 14 | 5 | 0.6% | |
| 13.87 | 1 | 0.1% | |
| 13.03 | 1 | 0.1% | |
| 13 | 6 | 0.8% | |
| 12.94 | 1 | 0.1% |
cpi_rent
Real number (ℝ≥0)
| Distinct count | 684 |
|---|---|
| Unique (%) | 88.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 139.36684129032258 |
| Minimum | 35.9 |
| Maximum | 341.95 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 6.1 KiB |
Quantile statistics
| Minimum | 35.9 |
|---|---|
| 5-th percentile | 38.07 |
| Q1 | 49.8 |
| median | 126.6 |
| Q3 | 210.45 |
| 95-th percentile | 305.7476 |
| Maximum | 341.95 |
| Range | 306.05 |
| Interquartile range (IQR) | 160.65 |
Descriptive statistics
| Standard deviation | 90.4683866 |
|---|---|
| Coefficient of variation (CV) | 0.6491385308 |
| Kurtosis | -0.9653871985 |
| Mean | 139.3668413 |
| Median Absolute Deviation (MAD) | 78.5 |
| Skewness | 0.5150686991 |
| Sum | 108009.302 |
| Variance | 8184.528975 |
Histogram with fixed size bins (bins=10)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 40 | 5 | 0.6% | |
| 39.2 | 5 | 0.6% | |
| 40.3 | 4 | 0.5% | |
| 37.6 | 4 | 0.5% | |
| 40.9 | 4 | 0.5% | |
| 38.7 | 4 | 0.5% | |
| 41.4 | 4 | 0.5% | |
| 38.1 | 4 | 0.5% | |
| 36.7 | 4 | 0.5% | |
| 40.5 | 4 | 0.5% | |
| Other values (674) | 733 | 94.6% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 35.9 | 3 | 0.4% | |
| 36 | 1 | 0.1% | |
| 36.1 | 1 | 0.1% | |
| 36.2 | 1 | 0.1% | |
| 36.4 | 2 | 0.3% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 341.95 | 1 | 0.1% | |
| 341.294 | 1 | 0.1% | |
| 340.811 | 1 | 0.1% | |
| 340.135 | 1 | 0.1% | |
| 339.519 | 1 | 0.1% |
homePrice_index
Real number (ℝ≥0)
| Distinct count | 399 |
|---|---|
| Unique (%) | 99.3% |
| Missing | 373 |
| Missing (%) | 48.1% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 128.83690298507463 |
| Minimum | 63.755 |
| Maximum | 219.819 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 6.1 KiB |
Quantile statistics
| Minimum | 63.755 |
|---|---|
| 5-th percentile | 72.59445 |
| Q1 | 80.80025 |
| median | 134.9045 |
| Q3 | 167.30875 |
| 95-th percentile | 205.0559 |
| Maximum | 219.819 |
| Range | 156.064 |
| Interquartile range (IQR) | 86.5085 |
Descriptive statistics
| Standard deviation | 45.68944536 |
|---|---|
| Coefficient of variation (CV) | 0.3546301122 |
| Kurtosis | -1.343375444 |
| Mean | 128.836903 |
| Median Absolute Deviation (MAD) | 46.4595 |
| Skewness | 0.1845956994 |
| Sum | 51792.435 |
| Variance | 2087.525417 |
Histogram with fixed size bins (bins=10)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 166.675 | 2 | 0.3% | |
| 78.175 | 2 | 0.3% | |
| 76.599 | 2 | 0.3% | |
| 127.15 | 1 | 0.1% | |
| 149.625 | 1 | 0.1% | |
| 182.722 | 1 | 0.1% | |
| 90.885 | 1 | 0.1% | |
| 176.637 | 1 | 0.1% | |
| 159.38 | 1 | 0.1% | |
| 140.164 | 1 | 0.1% | |
| Other values (389) | 389 | 50.2% | |
| (Missing) | 373 | 48.1% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 63.755 | 1 | 0.1% | |
| 64.156 | 1 | 0.1% | |
| 64.491 | 1 | 0.1% | |
| 64.994 | 1 | 0.1% | |
| 65.568 | 1 | 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 219.819 | 1 | 0.1% | |
| 218.6 | 1 | 0.1% | |
| 217.323 | 1 | 0.1% | |
| 215.16 | 1 | 0.1% | |
| 213.255 | 1 | 0.1% |
newHouse_starts
Real number (ℝ≥0)
| Distinct count | 572 |
|---|---|
| Unique (%) | 77.4% |
| Missing | 36 |
| Missing (%) | 4.6% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1428.3031123139378 |
| Minimum | 478.0 |
| Maximum | 2494.0 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 6.1 KiB |
Quantile statistics
| Minimum | 478 |
|---|---|
| 5-th percentile | 694.9 |
| Q1 | 1190 |
| median | 1452 |
| Q3 | 1652.5 |
| 95-th percentile | 2072.3 |
| Maximum | 2494 |
| Range | 2016 |
| Interquartile range (IQR) | 462.5 |
Descriptive statistics
| Standard deviation | 391.1805807 |
|---|---|
| Coefficient of variation (CV) | 0.2738778466 |
| Kurtosis | 0.03770518926 |
| Mean | 1428.303112 |
| Median Absolute Deviation (MAD) | 238 |
| Skewness | -0.04626757492 |
| Sum | 1055516 |
| Variance | 153022.2468 |
Histogram with fixed size bins (bins=10)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 1524 | 4 | 0.5% | |
| 1467 | 4 | 0.5% | |
| 1246 | 4 | 0.5% | |
| 1698 | 4 | 0.5% | |
| 1491 | 4 | 0.5% | |
| 1421 | 3 | 0.4% | |
| 1614 | 3 | 0.4% | |
| 1648 | 3 | 0.4% | |
| 1590 | 3 | 0.4% | |
| 1046 | 3 | 0.4% | |
| Other values (562) | 704 | 90.8% | |
| (Missing) | 36 | 4.6% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 478 | 1 | 0.1% | |
| 490 | 1 | 0.1% | |
| 505 | 1 | 0.1% | |
| 517 | 1 | 0.1% | |
| 534 | 1 | 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 2494 | 1 | 0.1% | |
| 2485 | 1 | 0.1% | |
| 2481 | 2 | 0.3% | |
| 2421 | 1 | 0.1% | |
| 2390 | 1 | 0.1% |
ppi_resConstruct
Real number (ℝ≥0)
| Distinct count | 319 |
|---|---|
| Unique (%) | 77.8% |
| Missing | 365 |
| Missing (%) | 47.1% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 161.32682926829267 |
| Minimum | 99.8 |
| Maximum | 232.5 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 6.1 KiB |
Quantile statistics
| Minimum | 99.8 |
|---|---|
| 5-th percentile | 104.28 |
| Q1 | 132.225 |
| median | 143.9 |
| Q3 | 204.275 |
| 95-th percentile | 225.52 |
| Maximum | 232.5 |
| Range | 132.7 |
| Interquartile range (IQR) | 72.05 |
Descriptive statistics
| Standard deviation | 40.02093538 |
|---|---|
| Coefficient of variation (CV) | 0.24807365 |
| Kurtosis | -1.32839534 |
| Mean | 161.3268293 |
| Median Absolute Deviation (MAD) | 32.25 |
| Skewness | 0.2130596817 |
| Sum | 66144 |
| Variance | 1601.675269 |
Histogram with fixed size bins (bins=10)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 142.4 | 7 | 0.9% | |
| 137.9 | 5 | 0.6% | |
| 141.3 | 5 | 0.6% | |
| 141 | 5 | 0.6% | |
| 99.9 | 4 | 0.5% | |
| 112.7 | 3 | 0.4% | |
| 140.3 | 3 | 0.4% | |
| 141.1 | 3 | 0.4% | |
| 184.4 | 3 | 0.4% | |
| 211 | 3 | 0.4% | |
| Other values (309) | 369 | 47.6% | |
| (Missing) | 365 | 47.1% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 99.8 | 1 | 0.1% | |
| 99.9 | 4 | 0.5% | |
| 100 | 2 | 0.3% | |
| 100.1 | 1 | 0.1% | |
| 100.4 | 1 | 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 232.5 | 1 | 0.1% | |
| 232.1 | 2 | 0.3% | |
| 231.7 | 1 | 0.1% | |
| 231.3 | 1 | 0.1% | |
| 231.2 | 1 | 0.1% |
resConstruct_spending
Real number (ℝ≥0)
| Distinct count | 222 |
|---|---|
| Unique (%) | 100.0% |
| Missing | 553 |
| Missing (%) | 71.4% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 440093.990990991 |
| Minimum | 244399.0 |
| Maximum | 684482.0 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 6.1 KiB |
Quantile statistics
| Minimum | 244399 |
|---|---|
| 5-th percentile | 252704.75 |
| Q1 | 331524.25 |
| median | 444576 |
| Q3 | 545365.75 |
| 95-th percentile | 622650.65 |
| Maximum | 684482 |
| Range | 440083 |
| Interquartile range (IQR) | 213841.5 |
Descriptive statistics
| Standard deviation | 124743.7025 |
|---|---|
| Coefficient of variation (CV) | 0.2834478659 |
| Kurtosis | -1.161382851 |
| Mean | 440093.991 |
| Median Absolute Deviation (MAD) | 103266.5 |
| Skewness | -0.08536787214 |
| Sum | 97700866 |
| Variance | 1.556099132e+10 |
Histogram with fixed size bins (bins=10)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 369315 | 1 | 0.1% | |
| 609170 | 1 | 0.1% | |
| 465592 | 1 | 0.1% | |
| 590189 | 1 | 0.1% | |
| 257737 | 1 | 0.1% | |
| 557411 | 1 | 0.1% | |
| 544096 | 1 | 0.1% | |
| 402091 | 1 | 0.1% | |
| 250965 | 1 | 0.1% | |
| 300200 | 1 | 0.1% | |
| Other values (212) | 212 | 27.4% | |
| (Missing) | 553 | 71.4% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 244399 | 1 | 0.1% | |
| 245226 | 1 | 0.1% | |
| 247981 | 1 | 0.1% | |
| 248808 | 1 | 0.1% | |
| 248929 | 1 | 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 684482 | 1 | 0.1% | |
| 681263 | 1 | 0.1% | |
| 676135 | 1 | 0.1% | |
| 674457 | 1 | 0.1% | |
| 670757 | 1 | 0.1% |
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
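As a rough illustration of that calculation, the sketch below computes r in plain Python. The function name `pearson_r` is illustrative and not part of pandas-profiling. The sample values are the unemplt_rate and newHouse_starts columns from the last ten rows of this dataset, chosen only as a small example.

```python
import math

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (std_x * std_y)

# Last ten rows of the dataset (2019-10 to 2020-07).
unemplt_rate = [3.6, 3.5, 3.5, 3.6, 3.5, 4.4, 14.7, 13.3, 11.1, 10.2]
newHouse_starts = [1340.0, 1371.0, 1587.0, 1617.0, 1567.0,
                   1269.0, 934.0, 1038.0, 1220.0, 1496.0]

r = pearson_r(unemplt_rate, newHouse_starts)

# Shifting and rescaling one variable (positive scale factor) leaves r unchanged.
r_rescaled = pearson_r([2 * u + 5 for u in unemplt_rate], newHouse_starts)

print(round(r, 3), round(r_rescaled, 3))
```

The same population (divide-by-n) convention is used for both the covariance and the standard deviations, so the choice of divisor cancels out of the ratio.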
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better than Pearson's r at catching nonlinear monotonic correlations. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
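A minimal sketch of that procedure in plain Python: rank both variables, then apply the covariance-over-standard-deviations formula to the ranks. The helper names `average_ranks`, `pearson_r`, and `spearman_rho` are illustrative, and giving tied values the average of their positions is an assumption about tie handling that the description above does not spell out.

```python
import math

def average_ranks(values):
    """Rank values from 1 to n; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson_r(x, y):
    """Covariance divided by the product of the standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (std_x * std_y)

def spearman_rho(x, y):
    """Pearson's r applied to the rank variables of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))

# A monotonic but nonlinear relation (y = x cubed).
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
print(spearman_rho(x, y), pearson_r(x, y))
```

On this example the rank correlation is perfect while Pearson's r falls short of 1, which is the difference the paragraph above describes.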
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
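The pair-counting definition can be sketched directly in Python. The function name `kendall_tau_a` is illustrative; the formula as stated corresponds to the variant usually called tau-a, in which tied pairs count as neither concordant nor discordant. Tie-corrected variants, which some libraries use by default, can give different values on data with ties.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        # A pair is concordant when x and y move in the same direction.
        product = (x[i] - x[j]) * (y[i] - y[j])
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
        # product == 0 means a tie in x or y: counted as neither.
    return (concordant - discordant) / len(pairs)

# Six pairs in total: five concordant, one discordant (the swapped middle values).
tau = kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4])
print(tau)
```

This direct version compares every pair, so its cost grows quadratically with the number of observations; that is fine for an illustration but slow for large series.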
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available.
First rows
| | DATE | uspop_growth | med_hIncome | rentl_vacnyRate | unemplt_rate | int_rate | cpi_rent | homePrice_index | newHouse_starts | ppi_resConstruct | resConstruct_spending |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1956-01-01 | NaN | NaN | 6.2 | 4.0 | 2.50 | 35.9 | NaN | NaN | NaN | NaN |
| 1 | 1956-02-01 | NaN | NaN | 6.2 | 3.9 | 2.50 | 35.9 | NaN | NaN | NaN | NaN |
| 2 | 1956-03-01 | NaN | NaN | 6.2 | 4.2 | 2.50 | 35.9 | NaN | NaN | NaN | NaN |
| 3 | 1956-04-01 | NaN | NaN | 5.9 | 4.0 | 2.65 | 36.0 | NaN | NaN | NaN | NaN |
| 4 | 1956-05-01 | NaN | NaN | 5.9 | 4.3 | 2.75 | 36.1 | NaN | NaN | NaN | NaN |
| 5 | 1956-06-01 | NaN | NaN | 5.9 | 4.3 | 2.75 | 36.2 | NaN | NaN | NaN | NaN |
| 6 | 1956-07-01 | NaN | NaN | 6.3 | 4.4 | 2.75 | 36.4 | NaN | NaN | NaN | NaN |
| 7 | 1956-08-01 | NaN | NaN | 6.3 | 4.1 | 2.81 | 36.4 | NaN | NaN | NaN | NaN |
| 8 | 1956-09-01 | NaN | NaN | 6.3 | 3.9 | 3.00 | 36.5 | NaN | NaN | NaN | NaN |
| 9 | 1956-10-01 | NaN | NaN | 5.8 | 3.9 | 3.00 | 36.5 | NaN | NaN | NaN | NaN |
Last rows
| | DATE | uspop_growth | med_hIncome | rentl_vacnyRate | unemplt_rate | int_rate | cpi_rent | homePrice_index | newHouse_starts | ppi_resConstruct | resConstruct_spending |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 765 | 2019-10-01 | 0.473954 | NaN | 6.4 | 3.6 | 2.25 | 334.680 | 212.165 | 1340.0 | 227.5 | 563877.0 |
| 766 | 2019-11-01 | 0.473954 | NaN | 6.4 | 3.5 | 2.25 | 335.819 | 212.300 | 1371.0 | 226.9 | 574079.0 |
| 767 | 2019-12-01 | 0.473954 | NaN | 6.4 | 3.5 | 2.25 | 336.789 | 212.413 | 1587.0 | 226.7 | 579863.0 |
| 768 | 2020-01-01 | 0.473954 | NaN | 6.6 | 3.6 | 2.25 | 337.825 | 212.470 | 1617.0 | 228.0 | 596728.0 |
| 769 | 2020-02-01 | 0.473954 | NaN | 6.6 | 3.5 | 2.25 | 338.616 | 213.255 | 1567.0 | 227.3 | 600581.0 |
| 770 | 2020-03-01 | 0.473954 | NaN | 6.6 | 4.4 | 0.25 | 339.519 | 215.160 | 1269.0 | 224.5 | 595963.0 |
| 771 | 2020-04-01 | 0.473954 | NaN | 5.7 | 14.7 | 0.25 | 340.135 | 217.323 | 934.0 | 215.9 | 569892.0 |
| 772 | 2020-05-01 | 0.473954 | NaN | 5.7 | 13.3 | 0.25 | 340.811 | 218.600 | 1038.0 | 217.3 | 549977.0 |
| 773 | 2020-06-01 | 0.473954 | NaN | 5.7 | 11.1 | 0.25 | 341.294 | 219.819 | 1220.0 | 221.4 | 542307.0 |
| 774 | 2020-07-01 | 0.473954 | NaN | 5.7 | 10.2 | 0.25 | 341.950 | NaN | 1496.0 | 225.3 | NaN |